Reconstructing voices within the multiple-average-voice-model framework
Authors
Abstract
Personalisation of voice output communication aids (VOCAs) makes it possible to preserve the vocal identity of people suffering from speech disorders. This can be achieved by adapting HMM-based speech synthesis systems using a small amount of adaptation data. When the voice has begun to deteriorate, reconstruction is still possible in the statistical domain by correcting the model parameters associated with the speech disorder. This can be done by substituting them with parameters from a donor’s voice, at the risk of losing part of the patient's identity. Recently, the Multiple-Average-Voice-Model (Multiple AVM) framework has been proposed for speaker adaptation. Adaptation is performed via interpolation in a speaker eigenspace spanned by the mean vectors of speaker-adapted AVMs, which can be tuned to the individual speaker. In this paper, we present the benefits of this framework for voice reconstruction: it requires only a very small amount of adaptation data, interpolation can be performed in a clean-speech eigenspace, and the resulting voice can be easily fine-tuned by acting on the interpolation weights. We illustrate our points with a subjective assessment of the reconstructed voice.
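The interpolation idea in the abstract can be illustrated with a minimal sketch. It assumes that the reconstructed voice's mean vector is a plain weighted sum of the mean vectors of the speaker-adapted AVMs; the function name `interpolate_means`, the toy 3-dimensional vectors, and the weight values are all hypothetical and are not taken from the paper, which does not specify here how the weights are estimated.

```python
from typing import List, Sequence


def interpolate_means(avm_means: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Return the weighted sum of the AVM mean vectors (assumed linear interpolation)."""
    if len(avm_means) != len(weights):
        raise ValueError("need exactly one weight per AVM mean vector")
    dim = len(avm_means[0])
    if any(len(mean) != dim for mean in avm_means):
        raise ValueError("all mean vectors must have the same dimension")
    return [sum(w * mean[d] for w, mean in zip(weights, avm_means))
            for d in range(dim)]


# Toy mean vectors for three hypothetical speaker-adapted AVMs.
avm_means = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 2.0],
    [2.0, 2.0, 0.0],
]

# Initial interpolation weights (hypothetical values, not estimated from data).
weights = [0.5, 0.25, 0.25]
reconstructed = interpolate_means(avm_means, weights)

# Fine-tuning by hand: move some weight from the first AVM to the second.
tuned = interpolate_means(avm_means, [0.25, 0.5, 0.25])

print(reconstructed)
print(tuned)
```

In this sketch, fine-tuning amounts to changing a handful of scalar weights rather than re-estimating model parameters, which is the property the abstract highlights.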
Similar resources
MARY TTS unit selection and HMM-based voices
This paper describes the implementation of a unit selection English voice and an HMM-based Hindi voice for our participation in the Blizzard Challenge 2013. The two voices have been created using the MARY TTS voice building framework. We describe how audiobook data is used to create the English voice and how a quality control measure (statistical model cost) is used to control the selection of uni...
Immediate effects of vocal warm-up exercises on elementary teachers' voice
Introduction: Teachers are a large group of professional voice users who are exposed to many voice problems. Vocal warm-up exercises (VWUE) can prepare the muscles involved in vocalization before teaching and can reduce voice damage in teachers. However, limited studies have examined the effects of VWUE on teachers' voices. Therefore, the present study was conducted to investigate the immediate...
Synthesis using Speaker Adaptation from Speech Recognition DB
This paper deals with the creation of multiple voices from a Hidden Markov Model based speech synthesis system (HTS). More than 150 Catalan synthetic voices were built using Hidden Markov Models (HMM) and speaker adaptation techniques. Training data for building a Speaker-Independent (SI) model were selected from both a general purpose speech synthesis database (FestCat) and a database designe...
VISA: The Voice Integration/Segregation Algorithm
Listeners are capable of perceiving multiple voices in music. Adopting a perceptual view of musical ‘voice’ that corresponds to the notion of auditory stream, a computational model is developed that splits musical scores (symbolic musical data) into different voices. A single ‘voice’ may consist of more than one synchronous note perceived as belonging to the same auditory stream; in thi...
Familiarity and Voice Representation: From Acoustic-Based Representation to Voice Averages
The ability to recognize an individual from their voice is a widespread ability with a long evolutionary history. Yet, the perceptual representation of familiar voices is ill-defined. In two experiments, we explored the neuropsychological processes involved in the perception of voice identity. We specifically explored the hypothesis that familiar voices (trained-to-familiar (Experiment 1), and ...
Publication date: 2015